Human-AI Coordination


On the Utility of Learning about Humans for Human-AI Coordination

Neural Information Processing Systems

While we would like agents that can coordinate with humans, current algorithms such as self-play and population-based training create agents that can only coordinate with themselves. Agents that assume their partners are optimal or similar to themselves can converge to coordination protocols that neither understand humans nor are understood by them. To demonstrate this, we introduce a simple environment that requires challenging coordination, based on the popular game Overcooked, and learn a simple model that mimics human play. We evaluate the performance of agents trained via self-play and population-based training. These agents perform very well when paired with themselves, but when paired with our human model, they are significantly worse than agents designed to play with the human model. An experiment with a planning algorithm yields the same conclusion, though only when the human-aware planner is given the exact human model it is playing with. A user study with real humans shows this pattern as well, though less strongly. Qualitatively, we find that the gains come from the agent adapting to the human's gameplay. Given this result, we suggest several approaches for designing agents that learn about humans in order to better coordinate with them.
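The gap the abstract describes can be illustrated with a minimal toy pairing experiment. The policies and scoring rule below are illustrative assumptions, not the paper's environment: a self-play agent scores perfectly with a copy of itself but fails with a fixed human model, while a best response trained against the human model coordinates well.

```python
def rollout(agent_policy, partner_policy, env_steps=10):
    """Joint return when agent_policy is paired with partner_policy.

    Toy stand-in for an Overcooked-style episode: the team scores only
    when the two players' conventions match on each step.
    """
    total = 0
    for t in range(env_steps):
        a = agent_policy(t)
        b = partner_policy(t)
        total += 1 if a == b else 0
    return total

# Hypothetical policies for illustration only.
self_play_agent = lambda t: 1  # convention learned from playing itself
human_model = lambda t: 0      # behavior cloned from human play
best_response = lambda t: 0    # agent trained with the human model

print(rollout(self_play_agent, self_play_agent))  # 10: great with its twin
print(rollout(self_play_agent, human_model))      # 0: protocol mismatch
print(rollout(best_response, human_model))        # 10: adapted to the human
```

Evaluating only the first pairing (self-play with itself) would hide the coordination failure; the second and third pairings are what the paper's human-model evaluation exposes.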



Automatic Curriculum Design for Zero-Shot Human-AI Coordination

You, Won-Sang, Ha, Tae-Gwan, Lee, Seo-Young, Kim, Kyung-Joong

arXiv.org Artificial Intelligence

Zero-shot human-AI coordination is the training of an ego-agent to coordinate with humans without using human data. Most studies on zero-shot human-AI coordination have focused on enhancing the ego-agent's coordination ability in a given environment without considering the issue of generalization to unseen environments. Real-world applications of zero-shot human-AI coordination should consider unpredictable environmental changes and the varying coordination ability of co-players depending on the environment. Previously, the multi-agent UED (Unsupervised Environment Design) approach has investigated these challenges by jointly considering environmental changes and co-player policy in competitive two-player AI-AI scenarios. In this paper, our study extends the multi-agent UED approach to zero-shot human-AI coordination. We propose a utility function and co-player sampling for the zero-shot human-AI coordination setting that help train the ego-agent to coordinate with humans more effectively than the previous multi-agent UED approach. The zero-shot human-AI coordination performance was evaluated in the Overcooked-AI environment, using human proxy agents and real humans. Our method outperforms other baseline models and achieves high human-AI coordination performance in unseen environments.
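The abstract does not spell out the utility function, but the curriculum idea behind UED can be sketched as score-prioritized environment sampling. The regret-style scores and softmax weighting below are assumptions for illustration, not the authors' formulation.

```python
import math
import random

def sample_environment(env_scores, temperature=1.0):
    """Sketch of prioritized environment sampling for a UED-style curriculum.

    Environments where the ego agent underperforms (high score) are drawn
    more often, forming an automatic curriculum over layouts.
    """
    names = list(env_scores)
    weights = [math.exp(env_scores[n] / temperature) for n in names]
    return random.choices(names, weights=weights, k=1)[0]

# Hypothetical Overcooked layouts with assumed regret estimates.
scores = {"cramped_room": 0.2, "coordination_ring": 1.5, "counter_circuit": 0.8}
print(sample_environment(scores))  # harder layouts are drawn more frequently
```

A full UED loop would update these scores after each training batch and additionally sample co-player policies, which is the extension this paper proposes.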


Reviews: On the Utility of Learning about Humans for Human-AI Coordination

Neural Information Processing Systems

Summary: The paper investigates the usefulness of modeling human behavior in human-AI collaborative tasks. In order to study this question, the paper introduces an experimental framework that consists of: a) modeling human behavior using imitation learning, b) training RL agents in several modes (self-play, trained against a human imitator, etc.), c) measuring the joint performance of human-AI collaboration. Using both simulation-based experiments and a user study, the paper showcases the importance of accounting for human behavior in designing collaborative RL agents. Comments: The topic of the paper is interesting and important for modern hybrid human-AI decision making systems. This seems like a well-written paper with solid contributions: to the best of my knowledge, no prior work has systematically investigated the utility of human modeling in the context of human-AI collaboration in RL.


Reviews: On the Utility of Learning about Humans for Human-AI Coordination

Neural Information Processing Systems

The paper proposes a new evaluation framework and benchmark for multi-agent learning settings where coordination with teammates is required to complete a task, and carefully evaluates state-of-the-art learning approaches in this novel setting, including evaluation with human players. All reviewers agreed that the paper's contributions are substantial and are likely to influence future work in this field. In the initial reviews, several areas of improvement were noted, including a request to precisely explain the relationship of this work to the substantial amount of prior work in human-robot and human-AI interaction, several requests for clarification, and suggestions for further experimentation. The reviewers were content with the author response, and in particular the provided clarification of the relationship to prior work and the overall contribution of the paper. I encourage the authors to carefully consider all reviewer comments when preparing the camera-ready version.


IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination

Hao, Xin, Nakisa, Bahareh, Rastgoo, Mohmmad Naim, Dazeley, Richard

arXiv.org Artificial Intelligence

In human-AI coordination scenarios, human agents usually exhibit asymmetric behaviors that are markedly sparser and less predictable than those of AI agents. These characteristics introduce two primary challenges to human-AI coordination: the effectiveness of obtaining sparse rewards and the efficiency of training the AI agents. To tackle these challenges, we propose an Intrinsic Reward-enhanced Context-aware (IReCa) reinforcement learning (RL) algorithm, which leverages intrinsic rewards to facilitate the acquisition of sparse rewards and utilizes environmental context to enhance training efficiency. Our IReCa RL algorithm introduces three unique features: (i) it encourages the exploration of sparse rewards by incorporating intrinsic rewards that supplement traditional extrinsic rewards from the environment; (ii) it improves the acquisition of sparse rewards by prioritizing the corresponding sparse state-action pairs; and (iii) it enhances training efficiency by optimizing exploration and exploitation through innovative context-aware weights on extrinsic and intrinsic rewards. Extensive simulations executed in the Overcooked layouts demonstrate that our IReCa RL algorithm can increase the accumulated rewards by approximately 20% and reduce the epochs required for convergence by approximately 67% compared to state-of-the-art baselines.
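A minimal sketch of the context-aware reward mixing in feature (iii), assuming a novelty-based weight schedule; the actual IReCa weights are defined in the paper and are not reproduced here.

```python
import math

def mixed_reward(r_ext, r_int, context_visits):
    """Combine extrinsic and intrinsic rewards with context-aware weights.

    The intrinsic term is up-weighted in rarely visited contexts to drive
    exploration toward sparse rewards, and decays as a context becomes
    familiar so the extrinsic reward dominates exploitation.
    """
    w_int = 1.0 / math.sqrt(1 + context_visits)  # assumed novelty schedule
    w_ext = 1.0 - 0.5 * w_int                    # assumed complementary weight
    return w_ext * r_ext + w_int * r_int

print(mixed_reward(r_ext=1.0, r_int=0.2, context_visits=0))   # novel context
print(mixed_reward(r_ext=1.0, r_int=0.2, context_visits=99))  # familiar context
```

As `context_visits` grows, the intrinsic weight shrinks toward zero, which is one simple way to realize the exploration-to-exploitation shift the abstract describes.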


Language Instructed Reinforcement Learning for Human-AI Coordination

Hu, Hengyuan, Sadigh, Dorsa

arXiv.org Artificial Intelligence

One of the fundamental quests of AI is to produce agents that coordinate well with humans. This problem is challenging, especially in domains that lack high quality human behavioral data, because multi-agent reinforcement learning (RL) often converges to different equilibria from the ones that humans prefer. We propose a novel framework, instructRL, that enables humans to specify what kind of strategies they expect from their AI partners through natural language instructions. We use pretrained large language models to generate a prior policy conditioned on the human instruction and use the prior to regularize the RL objective. This leads to the RL agent converging to equilibria that are aligned with human preferences. We show that instructRL converges to human-like policies that satisfy the given instructions in a proof-of-concept environment as well as the challenging Hanabi benchmark. Finally, we show that knowing the language instruction significantly boosts human-AI coordination performance in human evaluations in Hanabi.
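The regularization idea can be sketched as a softmax policy over Q-values shifted by the log-probabilities of an LLM-derived prior; the coefficient `lam` and the exact functional form are assumptions here, not the paper's implementation.

```python
import numpy as np

def regularized_policy(q_values, prior_logprobs, lam=0.5):
    """Behavior policy from Q-values regularized toward an LLM prior.

    The prior, conditioned on a human instruction, shifts the agent's
    action preferences so RL converges to equilibria consistent with
    what the human asked for.
    """
    scores = q_values + lam * prior_logprobs  # RL value shaped by the prior
    z = np.exp(scores - scores.max())         # numerically stable softmax
    return z / z.sum()

# Toy tie-break: two actions with equal Q-values; a hypothetical prior for
# the instruction "pass the onion" favors action 0.
q = np.array([1.0, 1.0])
prior = np.log(np.array([0.9, 0.1]))
print(regularized_policy(q, prior))  # ~[0.75, 0.25]
```

With equal Q-values the unregularized policy would be uniform; the instruction prior breaks the tie, which is the equilibrium-selection effect the abstract describes.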


PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination

Lou, Xingzhou, Guo, Jiaxian, Zhang, Junge, Wang, Jun, Huang, Kaiqi, Du, Yali

arXiv.org Artificial Intelligence

Zero-shot human-AI coordination holds the promise of collaborating with humans without human data. Prevailing methods try to train the ego agent with a population of partners via self-play. However, these methods suffer from two problems: 1) the diversity of a population with finitely many partners is limited, thereby limiting the capacity of the trained ego agent to collaborate with a novel human; 2) current methods only provide a common best response for every partner in the population, which may result in poor zero-shot coordination performance with a novel partner or humans. To address these issues, we first propose a policy ensemble method to increase the diversity of partners in the population, and then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives so that it can take different actions accordingly. In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners. We conduct experiments in the Overcooked environment and evaluate the zero-shot human-AI coordination performance of our method with both behavior-cloned human proxies and real humans. The results demonstrate that our method significantly increases the diversity of partners and enables ego agents to learn more diverse behaviors than baselines, thus achieving state-of-the-art performance in all scenarios. We also open-source a human-AI coordination study framework on Overcooked for the convenience of future studies.
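The context-aware identification step can be sketched with a likelihood-based matcher; PECAN's actual method uses learned context encoders, and the dictionary-valued policy primitives below are hypothetical.

```python
import math

def identify_primitive(observed_actions, primitives):
    """Match the partner's observed actions to a candidate policy primitive.

    Scores each primitive by the log-likelihood of the observed actions
    and returns the best match, so the ego agent can select the
    corresponding best response from its policy ensemble.
    """
    def loglik(policy):
        return sum(math.log(policy.get(a, 1e-9)) for a in observed_actions)
    return max(primitives, key=lambda name: loglik(primitives[name]))

# Hypothetical policy primitives expressed as action distributions.
primitives = {
    "onion_runner": {"fetch_onion": 0.8, "serve": 0.2},
    "dish_server": {"fetch_onion": 0.1, "serve": 0.9},
}
print(identify_primitive(["serve", "serve", "fetch_onion"], primitives))
# -> dish_server
```

Once the primitive is identified, the ego agent can act differently per partner instead of playing a single common best response, which is the second problem the abstract raises.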